1. Early Recognition Experiments


At the beginning of the project, we carried out a series of recognition experiments using conventional techniques, including K-L decomposition (the eigenimage approach) and simple template matching. We also tested configuration-independent techniques based on target dimensions. These results were intended to build an understanding of synthetic aperture radar automatic target recognition (SAR ATR) and to serve as a baseline for comparisons.

A set of 200 chips for each of the 3 targets (BTR70, BMP2, and T72) is used in the experiments. The chips are sorted by target azimuth: odd chips are used for training and even chips are used for testing. The peak detection module and the segmentation module described in [1] are applied to the 600 chips to obtain 20 by 40 subimages, or regions of interest (ROI), containing the targets. For chips in which the segmentation module fails to find the ROI, we segment the chip manually. For each ROI, we also extract a 4 by 40 subimage that contains only the longer leading surface (LLS). The LLS is defined as the longer side of the vehicle facing the radar. See Figure 1.


Figure  1:  ROI  and  LLS  Subimages  Extracted  from  a  Chip  for  Training  and  Testing 

In the experiment using K-L decomposition, a covariance matrix is computed from the training images. The 10 eigenvectors of the matrix with the largest eigenvalues constitute the feature space axes. When a test image is presented to the classifier, the estimated target azimuth, α, is used to select the K training images (from each target class) whose azimuths are closest to α; these are used to construct a template. Likewise, the K training images (from each target class) whose azimuths are closest to α ± 180° are selected to construct a 180° alternative template. In other words, every test image is compared with 2N candidate templates, where N is the number of candidate target classes. In the preliminary result presented below, we use K = 3 and N = 3.

To classify a test image, the test image and the 2N templates are projected onto the feature space, and the template that has the smallest Euclidean distance to the test image determines the class tag and the pose of the test image. In addition to using the whole ROI to train and test the classifier, we also use the LLS subimages for training and testing. The reason for using LLS subimages is that they are likely to be configuration independent. Confusion matrices are shown in Table 1 and Table 2. Note that our experiment differs from other similar experiments [2] in two respects. First, we use segmented ROIs. Second, we dynamically construct azimuth-dependent templates for classification. Azimuth-dependent templates can account for the great variability of SAR imagery and, therefore, improve the recognition rate.
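The template selection and nearest-template decision described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the function names are hypothetical, a template is assumed to be the pixel-wise average of the K selected training images, and distances are computed directly in pixel space (the projection onto the 10 K-L eigenvectors is omitted to keep the sketch short).

```python
def circ_diff(a, b):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def build_template(training, azimuth, k=3):
    """Average the k training images whose azimuth is closest to `azimuth`.

    `training` is a list of (azimuth_deg, image) pairs; an image is a flat
    list of pixel values. Averaging is an assumption of this sketch.
    """
    nearest = sorted(training, key=lambda t: circ_diff(t[0], azimuth))[:k]
    n_pix = len(nearest[0][1])
    return [sum(img[i] for _, img in nearest) / len(nearest) for i in range(n_pix)]


def classify(test_image, est_azimuth, training_by_class, k=3):
    """Return (class_name, pose_deg) of the closest of the 2N templates."""
    best = None
    for name, training in training_by_class.items():
        # Two templates per class: one at the estimate, one flipped by 180 degrees.
        for pose in (est_azimuth % 360.0, (est_azimuth + 180.0) % 360.0):
            template = build_template(training, pose, k)
            dist = sum((a - b) ** 2 for a, b in zip(test_image, template)) ** 0.5
            if best is None or dist < best[0]:
                best = (dist, name, pose)
    return best[1], best[2]
```

With N = 3 classes, `classify` evaluates six templates per test image, which matches the count given in the text.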


Table 1: Recognition Using ROIs
Recognition based on cross correlation gives results comparable to those of K-L-based recognition (Table 3). As in the case of K-L-based recognition, each test image is compared with six templates. For each template, we use leading surfaces and peaks to find possible alignments of the test image and the template, as in Section . The alignment that gives the best cross correlation provides the score of the template, and the template with the highest score determines the class tag and pose of the test target. The cross correlation of two images, f and g, is computed using the following expression:


$$\text{Cross correlation} = \frac{\sum_{m,n} (f_{m,n} - \bar{f}) \cdot (g_{m,n} - \bar{g})}{\sigma_f \, \sigma_g} \tag{1}$$
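Expression (1) can be evaluated with a few lines of code. The sketch below is illustrative only; the function name is hypothetical, and it assumes that σ_f and σ_g are the square roots of the summed squared deviations of each image, so that the score lies in [-1, 1]. The source does not spell out this normalization.

```python
import math


def cross_correlation(f, g):
    """Evaluate expression (1) for two equal-size images given as lists of rows."""
    fv = [p for row in f for p in row]
    gv = [p for row in g for p in row]
    if len(fv) != len(gv):
        raise ValueError("images must have the same number of pixels")
    f_mean = sum(fv) / len(fv)
    g_mean = sum(gv) / len(gv)
    num = sum((a - f_mean) * (b - g_mean) for a, b in zip(fv, gv))
    # Assumed normalization: root of the summed squared deviations.
    sigma_f = math.sqrt(sum((a - f_mean) ** 2 for a in fv))
    sigma_g = math.sqrt(sum((b - g_mean) ** 2 for b in gv))
    if sigma_f == 0.0 or sigma_g == 0.0:
        return 0.0  # a constant image carries no correlation information
    return num / (sigma_f * sigma_g)
```

In the matching scheme described above, this score would be computed once per candidate alignment, and the best alignment would give the score of the template.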


Table  3:  Recognition  Using  Cross  Correlation 
Observed length and width of an ROI can also be used for recognition, assuming little or no occlusion. For most target classes, recognition using length and width is independent of configuration changes on the deck of a target. From the training images, we construct piecewise linear functions that approximate the length and width of a target as the target azimuth varies. An example is shown in Figure 2. The product of Gaussian likelihoods is used as a score for classification. In expression (2), $l_o$ and $w_o$ are the observed length and width of the ROI, respectively; $l_{i,\alpha}$ and $w_{i,\alpha}$ are the stored length and width of the ith target class at azimuth α. Table 4 shows the confusion matrix of our experiment. Note that σ = 1 pixel is used.
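A minimal sketch of this size-based score is given below. It is an illustration under stated assumptions rather than the report's code: the function names and the knot representation are hypothetical, the piecewise linear functions are stored as (azimuth, value) knots with strictly increasing azimuth, and the Gaussian normalization constants are dropped because σ is shared by all classes.

```python
import math


def piecewise_linear(knots, azimuth):
    """Evaluate a piecewise linear function given sorted (azimuth, value) knots."""
    if azimuth <= knots[0][0]:
        return knots[0][1]
    if azimuth >= knots[-1][0]:
        return knots[-1][1]
    for (a0, v0), (a1, v1) in zip(knots, knots[1:]):
        if a0 <= azimuth <= a1:
            t = (azimuth - a0) / (a1 - a0)
            return v0 + t * (v1 - v0)


def size_score(l_obs, w_obs, length_knots, width_knots, azimuth, sigma=1.0):
    """Product of two unnormalized Gaussian likelihoods (length and width)."""
    l_ref = piecewise_linear(length_knots, azimuth)
    w_ref = piecewise_linear(width_knots, azimuth)
    return math.exp(-((l_obs - l_ref) ** 2 + (w_obs - w_ref) ** 2) / (2.0 * sigma ** 2))


def classify_by_size(l_obs, w_obs, azimuth, models, sigma=1.0):
    """`models` maps class name -> (length_knots, width_knots); pick the best score."""
    def score(name):
        length_knots, width_knots = models[name]
        return size_score(l_obs, w_obs, length_knots, width_knots, azimuth, sigma)
    return max(models, key=score)
```

Because the score depends only on the two observed dimensions, it is unaffected by configuration changes that leave the outline of the vehicle unchanged.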
Figure 2: A Piecewise Linear Function Constructed from the Training Images to Approximate the Length of the BTR70 at Different Azimuths


Table  4:  Recognition  Using  Length  and  Width 
2. Structural Model-Based Recognition Approach


Statistical  classification,  view-based  recognition,  and  model-based  recognition  are  the 
techniques  commonly  seen  in  the  computer  vision  literature.  A  structural  model-based 
approach  is  adopted  in  this  research  effort. 

Because of the great variability of SAR imagery, orientation-dependent templates or models are often used for recognition, regardless of which technique is adopted. Section  describes how an orientation-dependent model is selected in our approach.

Statistical classification is computationally tractable. However, our early experiments with statistical classification (K-L decomposition; eigenimage approach) [2] achieved only about an 80 percent success rate in recognition.

The view-based approach matches the input image to every possible view of all target classes. It has two fundamental limitations: 1) it lacks a good indexing capability that selects a few candidate target classes from the entire database for matching, and 2) it is computationally prohibitive to maintain a set of views that covers a target under the extended operating conditions (EOCs). Take the T72 tank, for example: whether the hatch is open or closed, and whether the oil tanks are present or not, will result in different images [3]. In other words, recognition with the view-based approach becomes a combinatorial problem.

A model-based recognition system has a few characteristic traits, including low-level (image-level) feature detection/extraction, generic hypothesis generation, indexing at various levels of the system, and matching. It is tractable, and intuitive to humans, to integrate a probability framework known as a Bayesian network into a model-based system to make decisions at various levels (e.g., image level, component level, and object level) of the system.


2.1  Generic  Vehicle  Model 


We adopt a generic vehicle model, unique in the ATR community, to define the parameters of length, width, and orientation used in recognition. Because most ground vehicles of interest have a rectangular chassis, it is natural to fit a rectangular bounding box to a set of image features to estimate a pair of leading surfaces, which in turn give the estimated length, width, and orientation. The rectangular bounding box model can be extended to include articulating parts that accommodate features protruding outside of the box, such as a gun or antenna.
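To make the idea of fitting a rectangular box to image features concrete, the sketch below searches for a small-area rectangle enclosing a set of peak positions by sweeping the box orientation. This is a generic stand-in technique chosen for illustration, not the leading-surface estimator developed later in this report; the function name and the 1-degree sweep step are assumptions.

```python
import math


def fit_bounding_box(points, step_deg=1.0):
    """Return (length, width, orientation_deg) of the smallest-area box found.

    `points` is a list of (x, y) positions. The sweep covers [0, 90) degrees
    because a rectangle repeats itself every 90 degrees; the returned
    orientation is the direction of the longer side, in [0, 180).
    """
    best = None
    for s in range(int(round(90.0 / step_deg))):
        theta_deg = s * step_deg
        c = math.cos(math.radians(theta_deg))
        sn = math.sin(math.radians(theta_deg))
        # Coordinates of every point in the rotated (u, v) frame.
        u = [x * c + y * sn for x, y in points]
        v = [-x * sn + y * c for x, y in points]
        du, dv = max(u) - min(u), max(v) - min(v)
        area = du * dv
        if best is None or area < best[0]:
            if du >= dv:
                best = (area, du, dv, theta_deg)
            else:
                best = (area, dv, du, theta_deg + 90.0)
    return best[1], best[2], best[3]
```

The longer and shorter sides of the fitted box play the roles of length and width, and the direction of the longer side gives an orientation estimate that is ambiguous by 180 degrees.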

The estimated leading surfaces provide a coordinate frame (called the target coordinate frame) local to the target. Under the strong assumption that the amplitudes and relative positions of enough peaks (bright points) in the image are quasi-invariant over a small range of target orientation (with respect to the radar) and over variations in radar viewing parameters, target coordinate frames can be used to align images for matching individual image features. Typically, 10 to 15 peaks that persist for about 3 to 5 degrees are sufficient for the quasi-invariance assumption; this condition is verified in this report (Section ).

Figure 3: Generic Vehicle Model (Top View)


2.2  System  Overview 

Figure  4  gives  an  overview  of  our  SAR  ATR  system.  The  system  has  two  phases  -  the 
hypothesis  generation  phase  that  selects  a  few  candidate  target  classes  from  the  target 
database,  and  the  hypothesis  verification  phase  that  outputs  the  best  class  and  pose  for  the 
test  image. 

The hypothesis generation phase starts by estimating low-level image features (e.g., peaks) and finding clusters of peaks that correspond to a target area. The generic vehicle model is fitted to the target peaks to obtain estimates of target length, width, and orientation. Target peaks and the estimated parameters are then used in target indexing.

The  hypothesis  verification  phase  aligns  candidate  class  images  to  the  test  image,  and  this 
process  is  sped  up  with  the  help  of  the  Delaunay  walk  algorithm.  The  Evaluation  module 
selects  the  best  class  and  pose  for  the  test  image. 


Test image → Generation phase → Candidates → Verification phase

Generation phase:
■ Low-level feature estimation, e.g., peaks and edgels
■ Segmentation/extended feature estimation, e.g., find clusters of target peaks
■ Generic model estimation: length, width, and orientation
■ Target indexing: compute candidates

Verification phase:
■ Image alignment: find feature correspondences
■ Delaunay walk: speed up image alignment
■ Hypothesis evaluation: select the best match from candidates

Figure 4: System Overview
3. Persistent Scattering


Because of the great variability of SAR imagery, recognition with SAR image features such as peaks (bright points) was once considered impossible. In recent years, researchers have demonstrated that peaks that persist over a few degrees of target orientation with respect to the radar (Figure 5) are sufficient for recognition.

Figure 5: Target Orientation (Top View)

Persistent scattering in inverse SAR (ISAR) and synthesized SAR (XPATCH) imagery was studied by Dudgeon et al. [4] and Binford et al. [5], respectively. In this report, we study persistence in MSTAR (Moving and Stationary Target Acquisition and Recognition) imagery. The results give evidence that there are enough scatterers with sufficient persistence for structural model-based recognition.

We  study  persistent  scatterers  in  a  set  of  MSTAR  images  of  three  different  vehicles 
(BTR70,  BMP2,  and  T72).  The  evaluation  of  persistent  scattering  was  done  in  three  ways: 
by  human  observers,  by  an  interactive  user  interface  and  a  human  operator,  and  by  an 
automated  program.  A  user  interface  or  an  automated  program  is  needed  because  we  do 
not  have  the  ground  truth  for  registering  images  as  in  the  case  of  synthesized  SAR  or  ISAR 
imagery. 

Images  of  targets  were  examined  and  compared  to  the  miniature  vehicle  models  we 
assembled  to  determine  the  salient  scatterers  and  the  azimuth  intervals  over  which  they  are 
visible.  For  example,  one  of  the  BTR70  hatches  is  visible  over  the  azimuth  intervals:  0  to 
15,  35  to  50,  85  to  110,  and  150  to  175  degrees. 


3.1  Peak  Detection 


Image  peaks  (bright  points)  are  the  primary  low-level  image  features  used  in  our  current 
SAR  ATR  system.  Peaks  on  the  target  area  are  strong  radar  returns  from  scatterers  on  the 
targets.  Corner  reflectors,  dihedrals,  and  planes  are  commonly  seen  simple  scatterers. 
Different  scatterers  have  different  stability;  for  example,  a  corner  reflector  is  more  stable 
than  a  plane  scatterer  in  that  energy  is  more  likely  to  be  reflected  back  from  a  corner 
reflector. 
Figure 6: Detected Peaks

Figure 7: Left: Superposition of Rotated Peaks; Right: Superposition of Rotated Intensity Images


A generic peak detector by Wang and Binford [1] is used to detect image peaks. Figure 6 shows an example of detected peaks overlaid on the original intensity image. Peaks belonging to targets are segmented (selected) before persistent scattering is studied.


3.2  Peak  Persistence 

The peak detector [1] was applied to an image to extract peaks (bright points) in the image. The set of peaks was then rotated to zero azimuth angle, and an interactive user interface was subsequently used to register it to a set of reference peaks. This process was repeated for images with target azimuth between 45 and 135 degrees. Figure 7 shows the superposition of the rotated peaks and intensity images of the BTR70.

To automate the registration process, we developed a method based on leading surfaces and the ideogram. We use leading surfaces to establish local coordinate frames for the two sets of peaks being registered. We can use a pair of peaks, one peak from each frame, that have the same local coordinates for registration (Figure 8(a)). However, some peaks may not have corresponding peaks in the other frame because of the great variability of SAR imagery. Therefore, we chose to generate correspondence hypotheses for the two frames using the 10 brightest peaks and to select the hypothesis with the largest overlap, where overlap is measured by the ideogram (Figure 8(b)). The ideogram counts the number of matched peak pairs and is expressed as follows:


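$$\text{Ideogram} = \sum_{i} \text{ideogram}_i \tag{3}$$

where $\text{ideogram}_i$ is the ideogram associated with the ith peak pair (Figure 8(b)).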
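The hypothesize-and-score registration described above can be sketched as follows. This is a simplified illustration, not the report's program: the function names are hypothetical, peaks are assumed to be (x, y, amplitude) tuples already expressed in their local coordinate frames, each hypothesis is a pure translation obtained by pairing one bright peak from each frame, and the ideogram is approximated by a hard count of peaks that land within a tolerance `tol` of some peak in the other frame.

```python
def count_matches(peaks_a, peaks_b, shift, tol=1.0):
    """Count peaks of frame A that fall within `tol` of a peak of frame B
    after frame A is translated by `shift` (a stand-in for the ideogram)."""
    dx, dy = shift
    count = 0
    for xa, ya, _ in peaks_a:
        if any((xa + dx - xb) ** 2 + (ya + dy - yb) ** 2 <= tol ** 2
               for xb, yb, _ in peaks_b):
            count += 1
    return count


def register(peaks_a, peaks_b, n_bright=10, tol=1.0):
    """Try every pairing of the brightest peaks; return (best_shift, best_count)."""
    top_a = sorted(peaks_a, key=lambda p: p[2], reverse=True)[:n_bright]
    top_b = sorted(peaks_b, key=lambda p: p[2], reverse=True)[:n_bright]
    best_shift, best_count = (0, 0), -1
    for xa, ya, _ in top_a:
        for xb, yb, _ in top_b:
            shift = (xb - xa, yb - ya)  # hypothesis: these two peaks correspond
            c = count_matches(peaks_a, peaks_b, shift, tol)
            if c > best_count:
                best_shift, best_count = shift, c
    return best_shift, best_count
```

Restricting the hypotheses to the brightest peaks keeps the search small (at most 100 pairings here) while still tolerating peaks that have no counterpart in the other frame.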


Figure  8:  (a)  Local  Coordinate  Frames  Used  to  Establish  Correspondence  of  Peaks;  (b) 
Ideogram  Associated  with  the  ith  Peak  Pair 

Since  we  have  established  the  correspondence  of  peaks  from  one  frame  to  the  next,  we  can 
generate  plots  of  average  number  of  peaks  per  frame  versus  minimum  persistence  (in 
degrees)  and  plots  of  number  of  peaks  that  persist  for  more  than  1,  10,  or  20  degrees  versus 
target  azimuth  (Figure  9).  These  kinds  of  plots  were  introduced  by  Dudgeon  et  al.  [4]  for 
evaluating  persistent  scattering  in  ISAR  imagery.  The  plots  show  that  on  the  average  there 
are  10  peaks  that  will  persist  for  more  than  15  degrees. 
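Once each peak has been tracked from frame to frame, persistence statistics of this kind reduce to measuring runs of consecutive frames. The sketch below is a hypothetical illustration: it assumes each tracked peak is stored as a list of booleans (present or absent in each frame), that frames are spaced `step_deg` degrees apart, and that a run of k frames spans k times `step_deg` degrees.

```python
def longest_run(present):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in present:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def count_persistent(tracks, min_degrees, step_deg=1.0):
    """Number of tracked peaks whose longest consecutive run exceeds `min_degrees`."""
    return sum(1 for present in tracks
               if longest_run(present) * step_deg > min_degrees)
```

Because only consecutive frames are counted, a scatterer that disappears briefly and then reappears is credited with its longest uninterrupted run only.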

Existence  of  persistent  scattering  is  the  basis  for  structural  model-based  target  recognition 
with  SAR  imagery.  Matching  of  individual  scatterers  (target  features)  enables  recognition 
under  articulation,  obscuration,  and  configuration  changes  of  targets. 

With an enhanced image alignment technique, we have improved the ability of our automated programs to track target scatterers across image frames. Figure 10 (a) shows the superposition of two aligned peak images. Figure 10 (b) shows the superposition of 21 peak images, all registered to the 11th peak image. Note that the effect of range foreshortening is corrected and all of the peak images are rotated to zero azimuth.

We use a set of 231 BTR70 images in our study of persistent scattering with automated programs. The result shows that on average there are 10 peaks that persist for more than 15 consecutive degrees. This characterization somewhat underestimates the persistence of scatterers that are obscured at some angles and then reappear. Nevertheless, it gives evidence that there are enough scatterers with sufficient persistence for structural model-based recognition.

(a) Average number of peaks per frame versus minimum persistence
(b) Number of peaks that persist for more than 1, 10, or 20 degrees versus target azimuth

Figure 9: Persistent Scattering

(a) Superposition of 2 BTR70 peak images
(b) Superposition of 21 BTR70 peak images

Figure 10: Alignment of Peak Images
(a) A tophat-like scatterer
(b) A dihedral-like scatterer

Figure 11: Amplitude Variation of Peaks


We  also  studied  the  variation  of  peak  amplitude  as  a  function  of  azimuth  angle  (Figure  11). 
The  preliminary  result  shows  that  it  may  be  possible  to  classify  target  peaks  into  a  handful 
of  categories.  Knowing  the  category  of  a  peak  allows  us  to  relate  it  back  to  the  physical 
component  of  the  target  that  reflected  the  radar  energy.  Also,  statistics  of  appearance  of 
each  peak  category  can  be  collected  for  a  given  target  from  the  training  imagery. 
4. Target Segmentation


We  use  MSTAR  target  chips  for  the  development  and  testing  of  our  ATR  algorithms.  We 
segment  out  targets  before  performing  estimation  of  parameters  (such  as  length,  width, 
and  orientation)  of  a  generic  vehicle  model. 

The segmentation module is a reimplementation of the segmentation technique developed by Wang and Binford [1] for SAR imagery. The technique involves peak detection, peak selection, Delaunay triangulation, and breaking long links in the triangulation. Failures of peak detection and target segmentation were analyzed, and plans have been made for improving performance.

The peak detector developed by Wang and Binford is used to estimate the position, amplitude, and widths of strong radar returns in the input image. Thresholds for peak amplitude are set to select strong peaks that possibly correspond to target scatterers, since the amplitude of the returned radar signal is probably the best quantity for discriminating targets from clutter. The Delaunay triangulation of the strong peaks is computed. Long links in the triangulation are broken to segment out target areas, because the density of strong peaks is low in the nontarget area.
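The grouping step can be illustrated with a short sketch. Note the substitution: instead of building a Delaunay triangulation and breaking its long links, this simplified version links every pair of strong peaks that lie within `max_link` pixels of each other and collects the connected groups with a union-find structure. For compact, well-separated clusters the groups should be similar, but this is not the triangulation-based method of [1]. The function name and default thresholds are assumptions made for illustration.

```python
def segment_peaks(peaks, min_amplitude=2.0, max_link=10.0):
    """Group strong peaks into clusters, largest first.

    `peaks` is a list of (x, y, amplitude). Peaks with amplitude above
    `min_amplitude` are kept, and any two kept peaks closer than `max_link`
    pixels are placed in the same group.
    """
    strong = [p for p in peaks if p[2] > min_amplitude]
    parent = list(range(len(strong)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(strong)):
        for j in range(i + 1, len(strong)):
            dx = strong[i][0] - strong[j][0]
            dy = strong[i][1] - strong[j][1]
            if dx * dx + dy * dy <= max_link ** 2:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(strong):
        groups.setdefault(find(i), []).append(p)
    return sorted(groups.values(), key=len, reverse=True)
```

The resulting groups could then be ranked, for example by size or average amplitude, to pick the one most likely to be the target.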
(a) Intensity image of a BTR70
(b) Strong peaks (target) and their triangulation
(c) Weak peaks (shadow) and their triangulation
(d) Segmented target and shadow areas

Figure 12: Segmentation

Figure 12 (a) shows the input intensity image of a BTR70. The peak detector [1] is applied to extract peaks (bright points) in the image. Figure 12 (b) shows the Delaunay triangulation of strong peaks with amplitude greater than 2.0. These peaks correspond to the class of target scatterers. After long links (> 10 pixels) in the Delaunay triangulation are broken, a few groups of connected strong peaks are formed. These groups can be further examined and ranked using prior knowledge of average peak amplitude and target sizes. Other types of objects can be segmented in the same way. For example, Figure 12 (c) shows the Delaunay triangulation of the weak peaks that possibly correspond to shadows. Figure 12 (d) shows the segmented target and shadow areas.
5. Higher Order Features


To  go  beyond  using  point  features  (e.g.,  image  peaks)  for  indexing  and  matching,  higher 
order  features  such  as  pairs  of  peaks,  line  features,  and  boundaries  can  be  used.  This 
section  summarizes  our  preliminary  results  on  finding  line  features  and  shadow  boundaries. 
Line  features  are  observable  in  almost  all  ground  targets,  for  example,  the  leading  surfaces, 
the  gun  of  a  T72,  and  the  shadow  of  the  gun.  Shadow  boundaries  contain  information  not 
available  from  the  target  area,  for  example,  the  height  of  the  target,  and  the  aim  angle  of 
the  gun.  A  more  detailed  analysis  should  also  take  into  account  the  effect  of  layover. 

The Binford-Chiang edge operator is first applied to extract edgels (edge elements), and these edgels are then linked into higher order features. Delaunay triangulation is used as a means of spatial indexing that establishes a neighborhood for each edgel; in other words, the neighboring edgels in the Delaunay triangulation are candidates for linking.


5.1  Line  Features 

Figure  13  shows  a  T72  image.  Figures  14(a)  and  14(b)  show  the  detected  delta  edgels  and 
step  edgels,  respectively.  The  example  here  focuses  on  linking  step  edgels. 

First,  edgels  are  detected  using  the  Binford-Chiang  edge  operator.  To  link  the  edgels, 
direct  neighbors  in  the  Delaunay  triangulation  (Figure  14(c))  are  used  as  candidates.  The 
best  candidate  is  selected  based  on  a  probability  measure,  and  this  process  is  continued 
until  the  probability  is  smaller  than  a  predetermined  threshold.  Figure  14(d)  shows  the  line 
features with at least two step edgels. As can be seen, step edges that correspond to the gun, the shadow of the gun, and part of the leading surfaces are good line features.


Figure  13:  T72  Image 
(a) Delta edgels
(b) Step edgels
(c) Delaunay triangulation of step edgels
(d) Line features

Figure 14: Finding Line Features


To assess the goodness of fit, the line defined by the first edgel and the last edgel is used as a reference line. The deviation of the tangent of each edgel from the reference line is modeled by a normal distribution with zero mean and standard deviation σ. In other words, $t_j \sim N(0, \sigma)$ (Figure 15). To allow more expected variation with a larger number of edgels on the line, the chi-square distribution is used to evaluate the overall goodness of fit of the edgels to the line: $\sum_{j=1}^{n} (t_j/\sigma)^2 \sim \chi^2$ with degrees of freedom = n. Other edgel features such as amplitude (intensity), contrast, and curvature can be incorporated to improve the decision.
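The goodness-of-fit statistic above is straightforward to compute. The sketch below is illustrative and rests on assumptions not stated in the source: edgels are taken to be (x, y, tangent_deg) tuples, the deviation t_j is taken to be the angular difference in degrees between the edgel tangent and the reference line, and tangents are treated as undirected (wrapped into [-90, 90)). The function names are hypothetical.

```python
import math


def tangent_deviations(edgels):
    """Angular deviation (degrees) of each edgel tangent from the reference
    line joining the first and last edgel."""
    (x0, y0, _), (x1, y1, _) = edgels[0], edgels[-1]
    ref = math.degrees(math.atan2(y1 - y0, x1 - x0))
    # Lines are undirected, so wrap each difference into [-90, 90).
    return [(t - ref + 90.0) % 180.0 - 90.0 for _, _, t in edgels]


def line_fit_statistic(edgels, sigma):
    """Return (chi_square_statistic, degrees_of_freedom) for a chain of edgels."""
    devs = tangent_deviations(edgels)
    stat = sum((d / sigma) ** 2 for d in devs)
    return stat, len(devs)
```

Since a chi-square variable with n degrees of freedom has mean n and variance 2n, a statistic far above n suggests that the chain is a poor fit to a straight line.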
Figure 15: Probability Model


5.2  Shadow  Boundaries 


A maximum likelihood (ML) decision method by Oliver et al. [6] is adopted to select, from the detected step edgels, candidate edgels that are likely to be on the shadow-background boundaries. Then, a hierarchical linking strategy is used to link the candidate edgels into a boundary curve. Delaunay triangulation is used as a means for searching and optimization in linking edgels.


5.2.1  Likelihoods 

The method by Oliver et al. [6] uses the ML decision procedure to select one of two hypotheses: {one distribution, two distributions}. In other words, the pixels on the two sides of the edgel follow either the same distribution or different distributions.

Gamma distribution for L-look SAR: $\Pr(I) = \left(\frac{L}{\mu}\right)^{L} \frac{I^{L-1}}{\Gamma(L)} e^{-LI/\mu}$

In our case (one-look): $\Pr(I) = \frac{1}{\mu} e^{-I/\mu}$, where $\mu$ is the mean of the distribution.

Take  n  pixels  from  each  side  of  the  edgel  and  compute  the  likelihood  of  the  two  hypotheses: 
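• Two distributions (characterized by $\mu_1$ and $\mu_2$):

$$\Pr(\mu_1, \mu_2 \mid \text{data}) \sim \prod_{i=1}^{n} \frac{1}{\mu_1} e^{-I_{1,i}/\mu_1} \prod_{j=1}^{n} \frac{1}{\mu_2} e^{-I_{2,j}/\mu_2} \tag{4}$$

The log-likelihood for two distributions is as follows:

$$\lambda_2(\mu_1, \mu_2) \sim -n\left[\ln(\mu_1) + \frac{\bar{I}_1}{\mu_1} + \ln(\mu_2) + \frac{\bar{I}_2}{\mu_2}\right] \tag{5}$$

where $\bar{I}_1 = \frac{1}{n}\sum_{i=1}^{n} I_{1,i}$ and $\bar{I}_2 = \frac{1}{n}\sum_{j=1}^{n} I_{2,j}$.

• One distribution (characterized by $\mu_1$):

$$\Pr(\mu_1 \mid \text{data}) \sim \prod_{i=1}^{2n} \frac{1}{\mu_1} e^{-I_i/\mu_1} \tag{6}$$

The log-likelihood for one distribution is as follows:

$$\lambda_1(\mu_1) \sim -2n\left[\ln(\mu_1) + \frac{\bar{I}}{\mu_1}\right] \tag{7}$$

where $\bar{I} = \frac{1}{2n}\sum_{i=1}^{2n} I_i$.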

5.2.2 Computing $\lambda_2$ and $\lambda_1$

Oliver et al. [6] use $\bar{I}_1$ and $\bar{I}_2$ as approximations for $\mu_1$ and $\mu_2$. Although these seem to be reasonable approximations, the resulting ML decision rule has very little discriminating power. The approximations favor the one-distribution hypothesis and become ineffective at the low SNR typical of SAR images; in our case, SNR ≈ 1, where the SNR is normalized by $\sqrt{\sigma_S^2 + \sigma_N^2}$, 'S' = background, and 'N' = shadow.

In our implementation, $\mu_1$ and $\mu_2$ from precollected statistics are used, and we compare the following two quantities to determine which hypothesis is more likely:

$$\lambda_2 = \max(\lambda_2(\mu_1, \mu_2),\ \lambda_2(\mu_2, \mu_1)) \tag{8}$$

$$\lambda_1 = \max(\lambda_1(\mu_1),\ \lambda_1(\mu_2)) \tag{9}$$

i.e.,

• hypothesis = one distribution, if $\lambda_1 > \lambda_2$.

• hypothesis = two distributions, if $\lambda_2 > \lambda_1$.
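The decision rule can be sketched in a few lines, assuming one-look (exponential) intensity statistics and the same number of pixels n taken from each side of the edgel. The function names are hypothetical, and `mu_bg` and `mu_shadow` stand for the precollected background and shadow means; they are illustrative parameters, not values from the report.

```python
import math


def loglik_two(side1, side2, mu1, mu2):
    """Two-distribution log-likelihood: side1 has mean mu1, side2 has mean mu2."""
    n = len(side1)
    mean1 = sum(side1) / n
    mean2 = sum(side2) / n
    return -n * (math.log(mu1) + mean1 / mu1 + math.log(mu2) + mean2 / mu2)


def loglik_one(side1, side2, mu):
    """One-distribution log-likelihood: all 2n pixels share the mean mu."""
    pixels = list(side1) + list(side2)
    mean = sum(pixels) / len(pixels)
    return -len(pixels) * (math.log(mu) + mean / mu)


def is_boundary_edgel(side1, side2, mu_bg, mu_shadow):
    """True when the two-distribution hypothesis is the more likely one."""
    lam2 = max(loglik_two(side1, side2, mu_bg, mu_shadow),
               loglik_two(side1, side2, mu_shadow, mu_bg))
    lam1 = max(loglik_one(side1, side2, mu_bg),
               loglik_one(side1, side2, mu_shadow))
    return lam2 > lam1
```

Taking the maximum over both assignments of the two means makes the test independent of which side of the edgel is the shadow.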


Note that Oliver et al. [6] find the edgel location and orientation by maximizing $\lambda_2$, whereas we use the Binford-Chiang edge operator for edgel detection.


5.2.3  Hierarchical  Linking 

Two Delaunay triangulations are used in the linking process (Figure 17): DTa, the Delaunay triangulation of all edgels, and DTb, the Delaunay triangulation of shadow-background boundary edgels (selected using the ML decision).

In the first linking stage, shadow-background boundary edgels that are direct neighbors in DTa are connected to form edgel chains. In the second linking stage, end points of edgel chains that are direct neighbors in DTb can be linked to form a boundary curve. Figure 18 shows the results of stage #1 and stage #2 linking.

Figure 16: Shadow-Background Boundary Edgels

(a) DTa
(b) DTb

Figure 17: Delaunay Triangulations

Figure 19 shows a detailed example of linking edgel chains. There are three direct neighbors for the edgel p in DTb (Figure 19(a)). The Delaunay walk in DTa is used to find initial paths from p to the three neighbors in DTb. These initial paths can be optimized with respect to contrast or a probability quantity to get the paths shown in Figure 19(b), and the best path can be selected using, for example, contrast.


(a) Edgel chains
(b) A boundary curve

Figure 18: Hierarchical Linking

(a) Initial paths
(b) Optimized paths

Figure 19: Hierarchical Linking


6.  Leading  Surface  Estimation 


We  define  leading  surfaces  to  be  the  sides  of  the  vehicle  that  face  the  radar.  The  LLS 
usually  corresponds  to  one  of  the  two  sides  of  the  target  and  the  shorter  leading  surface 
(SLS)  corresponds  to  either  the  front  or  the  end  of  the  target.  Figure  20  shows  a  top  view 
illustration.  The  LLS  determines  target  azimuth  up  to  a  180-degree  flip.  The  role  of  LLS 
and  SLS  may  be  changed  in  the  case  of  occlusion  because  the  sides  may  appear  shorter 
than  the  end  (or  the  front). 

Leading surfaces can be used in several subsystems in our ATR framework. They can also be used to reduce computational cost and/or improve the recognition rate in different recognition subsystems, e.g., statistical classification, template-based, and model-based.

Figure 20: Leading Surfaces, the Sides of the Vehicle Facing the Radar


6.1  March  1998 

We combine an amplitude-weighted ideogram and a probability-like weighting strategy to find the best leading surface (Figure 21 (a)), as shown in expression (10). The amplitude-weighted ideogram finds surfaces with multiple peaks close to them, and the probability-like weighting rejects surfaces that run across the body of the target. Figure 21 (b) shows examples of good and false leading surfaces.


[amplitude-weighted ideogram term] [probability-like weighting term]    (10)


The initial result of our leading surface estimator exhibited a sinusoidal bias, as shown in Figure 22. There are two causes of this sinusoidal bias: the first is inherent in the estimation algorithm, and the second comes from the SAR imaging geometry.
Figure 21: (a) Example of a Leading Surface and Ideogram, and (b) Examples of Leading Surfaces

Figure 22: Sinusoidal Estimation Bias

Because we only tried to estimate the LLS, peaks on the SLS pull the estimated LLS toward the SLS, as shown in Figure 23. Our solution is to estimate the two leading surfaces, LLS and SLS, together. Peaks are assigned to either the LLS or the SLS before we compute the ideogram.

Since SAR images are the projection of targets onto the slant plane, the dimension in the range direction is foreshortened by a factor of cos φ, as shown in Figure 24. As a result, the angle of a surface is increased or decreased slightly. To remedy this problem, a simple geometric correction is applied to compensate for the foreshortening before leading surface estimation is performed.
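A simple form of this geometric correction is sketched below. It is an illustration under assumptions: φ is taken to be the angle between the slant plane and the ground plane (for example, the depression angle), points are (cross_range, slant_range) pairs, and angles are measured from the cross-range axis. The function names are hypothetical.

```python
import math


def correct_foreshortening(points, phi_deg):
    """Undo range foreshortening by stretching the range coordinate by 1/cos(phi)."""
    scale = 1.0 / math.cos(math.radians(phi_deg))
    return [(x, r * scale) for x, r in points]


def corrected_angle(image_angle_deg, phi_deg):
    """Angle of a line in the ground plane, given its angle in the SAR image.

    Stretching the range axis by 1/cos(phi) gives
    tan(ground angle) = tan(image angle) / cos(phi).
    """
    a = math.radians(image_angle_deg)
    c = math.cos(math.radians(phi_deg))
    return math.degrees(math.atan2(math.sin(a), math.cos(a) * c))
```

Lines along the cross-range or range axis keep their angle, while oblique lines are rotated toward the range axis, which is consistent with an angle bias that varies with target azimuth.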

Table 5 summarizes the current performance of our target azimuth estimation using leading surfaces. We characterize the performance using two numbers: the error probability, Pe, is the number of estimation mistakes normalized by the total number of estimations, and the root-mean-square (RMS) error measures the fluctuation of the correct estimations. We define an estimation with an error of less than 10 degrees to be correct. The average RMS error is less than 3 degrees, which is slightly better than the performance of human observers.
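The two performance numbers can be computed as sketched below. This is an illustration, not the evaluation code behind Table 5; the function names are hypothetical, and the sketch assumes that errors are measured modulo 180 degrees, since the LLS determines target azimuth only up to a 180-degree flip.

```python
import math


def azimuth_error(estimate_deg, truth_deg):
    """Signed azimuth error wrapped into [-90, 90) to ignore 180-degree flips."""
    return (estimate_deg - truth_deg + 90.0) % 180.0 - 90.0


def evaluate(estimates, truths, tol_deg=10.0):
    """Return (Pe, RMS error of the correct estimations)."""
    errors = [azimuth_error(e, t) for e, t in zip(estimates, truths)]
    correct = [e for e in errors if abs(e) < tol_deg]
    pe = (len(errors) - len(correct)) / len(errors)
    if correct:
        rms = math.sqrt(sum(e * e for e in correct) / len(correct))
    else:
        rms = float("nan")
    return pe, rms
```

Only the correct estimations contribute to the RMS error, so a few gross mistakes raise Pe without inflating the RMS figure.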
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Figure  23:  True  Leading  Surfaces  (Dashed  Lines)  and  Estimated  Leading  Surfaces  (Solid 
Lines) 




Figure  24:  (a)  SAR  Imaging  Geometry  and  Foreshortening  (Side  View),  and  (b)  LLS  angle 
before  and  after  Projection 


Table 5: Performance of LS Estimation

          RMS error   Pe
BTR70     2.5°        3/168
BMP2      2.8°        2/231
T72       3.0°        4/203

6.2  November  1998 

We  have  made  progress  in  leading  surface  estimation.  Table  6  shows  our  previous  result  on 
leading  surface  estimation,  and  Table  7  shows  the  new  result. 

Table 6: Performance of LS Estimation

          RMS error   Pe
BTR70     2.47°       8/233 ≈ 3.43%
BMP2      2.79°       5/231 ≈ 2.16%
T72       3.10°       7/231 ≈ 3.29%

Table 7: Performance of LS Estimation

          RMS error   Pe
BTR70     2.20°       2/233 ≈ 0.86%
BMP2      2.66°       1/231 ≈ 0.43%
T72       3.28°       6/231 ≈ 2.60%

(a) previous result  (b) new result

Figure 25: Orientation Estimation Errors of a BTR70

The progress is a result of a few algorithmic improvements. First, LLS hypotheses with
longer length are emphasized because they are less sensitive to peak position fluctuations
(note that LLS hypotheses are generated from peaks). Second, the relation between
projected length and width (i.e., length > width) is used to eliminate most 90-degree
estimation errors (Figure 25). Third, we use a true probability instead of a probability-like
weighting for leading surfaces; i.e., we use the following metric to rank leading surface
hypotheses:


probability 


ideogram 


(11)

where $\Pr^{t}_{i}$ = Pr(the ith peak is a target peak) and $\Pr^{o}_{i}$ = Pr(the ith peak is outside the
leading surface).


6.3  June  1999 


The following equation yields a probability measure for a leading edge:

$$\Pr(\text{leading edge}) \approx \prod_{i \,\in\, \text{outside peaks}} \left[\, 1 - \Pr(i\text{th peak is a target peak}) \cdot \Pr(i\text{th peak is outside}) \,\right] \qquad (12)$$


where: 

Pr(ith  peak  is  a  target  peak)  —  J0a'  Rayleigh  dx ;  t  ns  ai  t 

rr2 

Pr(ith  peak  is  outside )  =  /0,li  dx;  t  ns  Hi  | 

Compensate  according  to  the  number  of  peaks  outside:  [Pr (leading  edge)]*™*77*7  ',ca,ts 
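As a rough illustration, Equation (12) can be read as a product, over the peaks outside the hypothesized leading edge, of one minus the joint probability that the peak is a target peak and lies outside. The sketch below assumes the two per-peak probabilities have already been computed and are passed in directly; it does not attempt the integrals or the compensation step, and the function name is hypothetical.

```python
def leading_edge_probability(outside_peaks):
    """Product form of the leading-edge probability, as read from Equation (12).

    outside_peaks: list of (p_target, p_outside) pairs, one per outside peak,
    where both entries are probabilities in [0, 1] (assumed to be given).
    """
    prob = 1.0
    for p_target, p_outside in outside_peaks:
        prob *= 1.0 - p_target * p_outside
    return prob

# Two outside peaks with illustrative probabilities.
example = leading_edge_probability([(0.5, 0.5), (0.2, 0.5)])
```

With no outside peaks the product is empty and the measure is 1, so hypotheses that leave strong target peaks outside are penalized most.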


6.4  Alternative  Leading  Surface  Hypotheses 

Leading  surfaces  define  a  rectangular  box  which  gives  target  length,  width,  and  orientation 
estimates  independent  of  target  class.  In  our  previous  results  (Table  8),  orientation 
estimation  errors  greater  than  10  degrees  are  encountered  1.3  percent  of  the  time  (i.e.,  Pe 
=  1.3  percent)  and  the  RMS  error  of  the  correct  estimations  is  less  than  3  degrees  (i.e., 
slightly  superhuman).  In  order  to  address  this  1.3  percent  estimation  error,  we  generate 
alternative  leading  surface  hypotheses  by  dropping  peaks  on  the  front  convex  hull  of  the 
target  peaks,  where  the  front  convex  hull  is  the  half  convex  hull  facing  the  radar. 


Table 8: Performance Achieved by Selecting Single Hypotheses

          RMS error   Pe
BTR70     2.20°       2/233 ≈ 0.86%
BMP2      2.66°       1/231 ≈ 0.43%
T72       3.28°       6/231 ≈ 2.60%

(a) Front Convex Hull  (b) Hypotheses  (c) Front Convex Hull

Figure 26: Alternative Leading Surface Hypotheses

Figure 26(b) shows the leading surface hypotheses generated from the front convex hull
shown in Figure 26(a). Because the front convex hull can be inaccurate due to nontarget
peaks outside the leading surfaces (at least one nontarget peak will be encountered 20
percent of the time), peaks on the front convex hull are dropped to get a new front convex
hull and a new set of leading surface hypotheses, as shown in Figure 26(c) and (d),
respectively. Linear extensions of pairs of peaks (from the front convex hull) in the
Delaunay triangulation of all target peaks are used as leading surface hypotheses. Notice
that the dropped peaks are still used for the evaluation of leading surface hypotheses. Also, by
using leading surface hypotheses generated from the front convex hulls, a 10-to-1 reduction
in the number of hypotheses (compared to using all pairs of peaks) is achieved.
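The idea of regenerating hypotheses after dropping hull peaks can be sketched as below. Several simplifications are assumed: the radar is taken to look along the +y axis, so the front hull is the lower convex hull; hypotheses are formed only from consecutive hull peaks (which are always connected in a Delaunay triangulation) rather than from all Delaunay-connected pairs; and peaks are dropped one at a time. All names are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def front_hull(peaks):
    """Lower convex hull (monotone chain), assumed to be the side facing the radar."""
    hull = []
    for p in sorted(set(peaks)):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

def hull_hypotheses(hull):
    """Each pair of consecutive hull peaks defines one candidate leading surface."""
    return list(zip(hull, hull[1:]))

def alternative_hypotheses(peaks):
    """Drop each front-hull peak in turn and collect hypotheses from the new hull."""
    alternatives = []
    for dropped in front_hull(peaks):
        remaining = [p for p in peaks if p != dropped]
        alternatives.extend(hull_hypotheses(front_hull(remaining)))
    return alternatives

peaks = [(0.0, 0.0), (1.0, -1.0), (2.0, 0.0), (1.0, 1.0), (1.0, 0.5)]
original = hull_hypotheses(front_hull(peaks))
alternatives = alternative_hypotheses(peaks)
```

In this toy set, the peak at (1, -1) plays the role of a nontarget peak in front of the vehicle; dropping it exposes the hypothesis through (0, 0) and (2, 0). As in the text, the dropped peak would still take part in scoring each hypothesis.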


7.  Target  Indexing 


The  target  indexing  module  selects  from  the  target  database  a  few  candidate  classes  for 
matching;  in  other  words,  exhaustive  matching  is  avoided  with  target  indexing.  There  are 
two  important  requirements  that  the  target  indexing  module  must  satisfy.  First,  the 
candidates  must  include  the  correct  target  classes.  Second,  this  indexing  step  must  be 
quick. 


7.1  The  Target  Database 

Each target class has 100 images in the target database; these images span the 360 degrees
of azimuth evenly. Peak detection, segmentation, and leading surface estimation are applied to each
database image to obtain image features and parameters.

All of the database images have a 17-degree depression angle. For each target azimuth
(orientation), a pointer table is created for all target classes. Pointer tables are explained
next.


Figure  27:  Target  Database 


7.2  Target  Indexing 

Pointer  tables  are  used  for  target  indexing.  Each  entry  in  a  pointer  table  contains  a  list  of 
pointers  pointing  to  peaks  of  target  classes.  For  example,  the  shaded  cell  in  Figure  28(a) 
has  a  pointer  to  peak  20  of  the  class  BTR70  with  a  score  of  0.95.  Also,  it  has  another 
pointer  to  peak  17  of  the  class  T72  with  a  score  of  0.90.  The  score  can  be  a  probability 
quantity. 
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When  an  unknown  target  is  encountered,  the  table  entries  hit  by  the  peaks  of  the  unknown 
target  are  used  for  sorting  out  candidate  target  classes  quickly  (Figure  28(b)).  For 
example,  the  total  score  for  the  BTR70  class  can  be  computed  by  summing  all  of  the  scores 
over  all  table  entries  hit  by  the  unknown  target.  The  same  procedure  is  used  for  other 
target  classes.  This  step  is  quick  because  it  involves  only  a  table  lookup  and  addition  of 
scores.  More  importantly,  the  indexing  procedure  uses  positive  evidence  for  finding 
candidate  classes.  This  can  be  useful  under  the  EOCs. 
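A pointer table of this kind can be sketched with a dictionary keyed by quantized peak position. The cell size, the coordinate layout, and the function names below are assumptions made for illustration; only the BTR70 peak 20 (score 0.95) and T72 peak 17 (score 0.90) entries echo the example in the text, and their positions are invented.

```python
from collections import defaultdict

def build_pointer_table(class_peaks, cell_size=1.0):
    """Map each table cell to a list of (class, peak id, score) pointers.

    class_peaks: {class_name: {peak_id: (x, y, score)}} in the target
    coordinate frame (assumed layout).
    """
    table = defaultdict(list)
    for class_name, peaks in class_peaks.items():
        for peak_id, (x, y, score) in peaks.items():
            cell = (int(x // cell_size), int(y // cell_size))
            table[cell].append((class_name, peak_id, score))
    return table

def index_target(table, unknown_peaks, cell_size=1.0, n_candidates=2):
    """Sum the scores of all pointers hit by the unknown target's peaks."""
    totals = defaultdict(float)
    for x, y in unknown_peaks:
        cell = (int(x // cell_size), int(y // cell_size))
        for class_name, _peak_id, score in table.get(cell, []):
            totals[class_name] += score
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n_candidates]

class_peaks = {
    "BTR70": {20: (3.2, 4.7, 0.95), 5: (1.1, 1.3, 0.80)},
    "T72": {17: (3.6, 4.1, 0.90)},
}
table = build_pointer_table(class_peaks)
candidates = index_target(table, [(3.4, 4.5), (1.5, 1.9)])
```

Only cells that are actually hit contribute, so a missing or occluded peak lowers a class total without vetoing the class; this is the sense in which the procedure relies on positive evidence.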

It  is  worth  pointing  out  that  the  axes  used  here  are  the  leading  surfaces  from  the  generic 
vehicle  model.  Also,  higher  order  target  features  may  be  used  to  help  the  discrimination  of 
true  target  classes  against  false  target  classes. 


Figure 29 illustrates the entire indexing process. With the azimuth (orientation)
information from the estimated generic vehicle model, the pointer table with the closest
azimuth is selected, and a few candidate target classes are found with that pointer table.
Azimuth-dependent pointer tables are necessary to address the high variability of SAR
imagery.
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(a)  Creating  pointer  tables 


Lj  =  table  entry  hit  by  the 
unknown  target 


(b)  Table  lookup 


Figure  28:  Target  Indexing 


Figure  29:  The  Entire  Indexing  Process 


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8.  Image  Alignment 


Image alignment is essential in our study of persistent scattering. Also, template images
and  the  input  test  image  must  be  aligned  for  recognition.  We  seek  the  best  rigid  body 
transformation  between  the  input  test  image  and  a  given  template  image  such  that  a 
preselected  metric  is  optimized.  Affine  transformation  is  overly  general  for  the  application 
(e.g.,  it  allows  shearing  of  images)  and  is  therefore  not  used.  The  alignment  procedure  has 
two  steps.  First,  peaks  are  used  to  generate  initial  alignment  hypotheses;  the  hypotheses 
are  ranked  by  their  values  of  the  preselected  metric.  Second,  we  refine  the  best  or  the  first 
few  best  initial  alignment  hypotheses  by  an  analytical  formulation.  Both  hypothesis 
generation  and  the  analytical  formulation  aim  at  efficient  image  alignment. 


8.1  Initial  Alignment 

Let $S_u = \{(u_x, u_y, A_u)\}$ and $S_v = \{(v_x, v_y, A_v)\}$ be the sets of template image peaks and
test image peaks, respectively. Notice that each peak is characterized by its position and
amplitude. We seek a rigid body transformation ($T_0$) that optimizes a preselected metric

(Equation  (13)).  For  aligning  peak  images,  we  use  1  =  Ei(Ai  +  Avi)e  or 
_ A 

1  =  T'i  i  ,AAi+AvA  i  c  ^  •  where  a  ~  1  pixel  is  used. 


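$$T_0 = \arg\max_T I(S_u, S_v, T) = \arg\max_T I(S_u, S'_v) \qquad (13)$$

where

$$T = \begin{bmatrix} \theta \\ a \\ b \end{bmatrix}, \qquad \begin{bmatrix} v'_x \\ v'_y \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & a \\ \sin\theta & \cos\theta & b \end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ 1 \end{bmatrix} \qquad (14)$$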

(a) Use peaks for hypothesis generation

Figure 30: Alignment Hypothesis Generation

Initial alignment gives an initial guess of the parameters $a$ and $b$, i.e., the x- and
y-translations. Since the target orientation of the template image is known and the target
orientation of the input test image can be estimated from the leading surfaces, the
template image can be rotated such that the two images have the same target orientation,
and the best initial guess of $\theta$ is zero.
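For concreteness, applying the rigid body transformation of Equation (14) to a peak set might look like the sketch below. It assumes peaks are stored as (x, y, amplitude) tuples and that the angle is given in degrees; the function name is hypothetical.

```python
import math

def rigid_transform(peaks, theta_deg, a, b):
    """Rotate peaks by theta about the origin, then translate by (a, b).

    Amplitudes are carried along unchanged, since a rigid body
    transformation affects positions only.
    """
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    return [(c * x - s * y + a, s * x + c * y + b, amp) for x, y, amp in peaks]

# Rotating (1, 0) by 90 degrees gives (0, 1); the translation then moves it to (1, 3).
moved = rigid_transform([(1.0, 0.0, 5.0)], 90.0, 1.0, 2.0)
```

With the initial guess of zero rotation, the transformation reduces to a pure shift by (a, b), which is why the initial hypotheses are translational.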

The  two  leading  surfaces  together  form  an  intrinsic  target  coordinate  frame  that  provides 
quasi-invariant  peak  coordinates  under  radar  viewing  parameter  variations.  Ideally,  two 
peaks,  one  from  each  of  the  two  images  being  aligned,  that  have  roughly  the  same 
coordinates  are  sufficient  for  the  alignment  task.  However,  some  peaks  may  not  have  the 
corresponding  peaks  due  to  the  great  variability  of  SAR  imagery  (Figure  30)  or  false 
segmentation.  Therefore,  we  choose  to  generate  alignment  hypotheses  using  the  five 
strongest  peaks  from  the  input  test  image  and  select  the  best  hypotheses  with  largest 
overlap,  where  overlap  is  measured  by  weighted-amplitude  ideogram  defined  by 

d\ 

J  =  Y.i(Aui  +  Avi)e~^ .  Note  that  strong  peaks  are  preferred  because  they  are  more  likely 
to  be  persistent,  and  therefore  their  corresponding  peaks  in  the  template  images  are  more 
likely to be found. Figure 31 shows the initial (translational; $\theta = 0$) alignment hypotheses
of two images.
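The hypothesis generation step can be sketched as follows. This is an illustration under explicit assumptions: each hypothesis is the translation that maps one of the strongest test peaks onto one template peak; the overlap uses a Gaussian weight exp(-d^2 / sigma^2) on the distance d between paired peaks, which is an assumed form of the weighting; and pairing uses a brute force nearest-peak search instead of a Delaunay walk. All names are hypothetical.

```python
import math

def overlap(template, test, a, b, sigma=1.0):
    """Amplitude-weighted overlap of test peaks shifted by (a, b) with the template.

    The Gaussian weight exp(-d^2 / sigma^2) is an assumed form.
    """
    total = 0.0
    for vx, vy, av in test:
        x, y = vx + a, vy + b
        # Brute force pairing: nearest template peak to the shifted test peak.
        ux, uy, au = min(template, key=lambda u: (u[0] - x) ** 2 + (u[1] - y) ** 2)
        d2 = (ux - x) ** 2 + (uy - y) ** 2
        total += (au + av) * math.exp(-d2 / sigma ** 2)
    return total

def initial_alignment(template, test, n_strong=5, sigma=1.0):
    """Return (score, a, b) of the best translational hypothesis."""
    strongest = sorted(test, key=lambda p: p[2], reverse=True)[:n_strong]
    best = None
    for vx, vy, _ in strongest:
        for ux, uy, _ in template:
            a, b = ux - vx, uy - vy
            score = overlap(template, test, a, b, sigma)
            if best is None or score > best[0]:
                best = (score, a, b)
    return best

template = [(0.0, 0.0, 1.0), (3.0, 0.0, 2.0), (0.0, 4.0, 3.0)]
test = [(-2.0, 1.0, 1.0), (1.0, 1.0, 2.0), (-2.0, 5.0, 3.0)]  # template shifted by (-2, +1)
score, a, b = initial_alignment(template, test)
```

Restricting hypothesis seeds to the strongest test peaks keeps the number of hypotheses small while still tolerating a few peaks that have no counterpart in the template.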

Delaunay walk (Section 9) is used to enable efficient searches for corresponding peaks. Note
that peak pairs have to be formed before the ideogram $I$ can be computed. This is
done by local searches in the Delaunay triangulations of the test peaks and the template
peaks. The worst-case search time is proportional to the diameters of the triangulations.


Figure  31:  Initial  Alignment  Hypotheses 


8.2  Refined  Alignment 


Given the sets $S_u = \{(u_x, u_y, A_u)\}$ and $S_v = \{(v_x, v_y, A_v)\}$, $I$ is a function of the
transformation $T$, i.e., $I = I(T)$. The conventional technique for refining alignment is to
shift and rotate $S_v$ by a small amount (i.e., sampling) until $I(T)$ reaches a
maximum. Instead of using this computationally expensive brute force search, we adopt a
Newton-Raphson [7] iterative optimization:

$$\nabla I(T) \approx \nabla I(T_0) + H(T_0) \cdot \delta T = H(T_0) \cdot \delta T \approx H(T) \cdot \delta T \qquad (15)$$

where $\nabla$ denotes the gradient operator, $H$ is the Hessian matrix, and $\delta T = T - T_0$; the
equality holds because $\nabla I(T_0) = 0$ at the maximum. Therefore,

$$T_0 \approx T - H(T)^{-1} \cdot \nabla I(T) \qquad (16)$$

In iterative form:

$$T_{i+1} \approx T_i - H(T_i)^{-1} \cdot \nabla I(T_i) \qquad (17)$$


In expanded form:

(a) Initial alignment, $\theta = 0°$, a = -2.8920 pixels, b = -1.9128 pixels

(b) Refined alignment, $\theta = -5.6721°$, a = -2.3513 pixels, b = -1.4737 pixels

Figure 32: Refining Alignment Hypothesis


Iterative Newton-Raphson searches yield good results because $I(T)$ is a well-behaved
function, and our best initial alignment hypothesis is often very close to the maximum.
The method usually takes 4 to 5 steps to converge. Figure 32 shows a comparison of the
initial and refined alignment. There are also other methods for optimization; see [7] for a
good account of references.
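The iteration of Equation (17) can be sketched generically as below. This is not the analytical formulation used in the report: the gradient and Hessian are approximated by central finite differences, and a toy Gaussian objective stands in for $I(T)$. The step sizes, the objective, and all names are assumptions.

```python
import math

def gradient(f, t, h=1e-4):
    """Central-difference gradient of f at t."""
    g = []
    for k in range(len(t)):
        tp, tm = list(t), list(t)
        tp[k] += h
        tm[k] -= h
        g.append((f(tp) - f(tm)) / (2 * h))
    return g

def hessian(f, t, h=1e-3):
    """Central-difference Hessian of f at t."""
    n = len(t)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                x = list(t)
                x[i] += si * h
                x[j] += sj * h
                return f(x)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def newton_refine(f, t0, steps=10, tol=1e-8):
    """Iterate T <- T - H(T)^-1 grad f(T), starting from t0."""
    t = list(t0)
    for _ in range(steps):
        step = solve(hessian(f, t), gradient(f, t))
        t = [ti - si for ti, si in zip(t, step)]
        if max(abs(s) for s in step) < tol:
            break
    return t

def toy_metric(t):
    """Stand-in for I(T) with its maximum at (theta, a, b) = (0.1, 1.0, -2.0)."""
    return math.exp(-((t[0] - 0.1) ** 2 + (t[1] - 1.0) ** 2 + (t[2] + 2.0) ** 2))

refined = newton_refine(toy_metric, [0.0, 0.8, -1.8])
```

As with the alignment metric, the iteration converges quickly only when the starting point is already near the maximum, which is the role of the initial alignment hypotheses.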
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9.  Delaunay  Walk 


Delaunay triangulation is used heavily as a means for spatial indexing, segmentation,
search, and optimization in our SAR ATR system. In this section, we present an algorithm
(Delaunay walk) for finding the closest point (peak) with Delaunay triangulation. An
immediate application is the association of individual peaks in a test image to peaks in a
model image, where the association is done in the sense of closest point. We also provide a
theorem as the basis for the algorithm in this section.


9.1  Algorithm 

This section describes an algorithm for finding the closest point $q^* \in Q$ for a given $p$ by
using Delaunay triangulation.

To find $q^*$ we start with a randomly selected point, $q \in Q$, and move to the neighboring
point that is the closest to $p$ among $q$'s neighbors, as defined by $DT(Q)$. This locally greedy
search is continued until no neighbor of $q$ is closer to $p$ than $q$. It is
proved (Section 9.3) that $q^*$ can always be found using this Delaunay walk technique with a
random start, and we call this property the monotonicity of Delaunay triangulation, in
analogy to the role of monotonic functions in optimization problems. The technique is
even more efficient for the problem of finding $q^* \in Q$ for each $p \in P$, assuming $P$ and $Q$
have the same uniform spatial densities. This increased efficiency is a direct result of using
both Delaunay triangulations, $DT(P)$ and $DT(Q)$.


Walking on DT(Q) ...

    q_c = a random start point in Q;
    d_c = ||p - q_c||;
    continue = yes;
    while (continue == yes)
        NB = neighbors(q_c);
        q* = argmin ||p - q||, q in NB;
        d* = ||p - q*||;
        if (d* < d_c)
            q_c = q*;
            d_c = d*;
        else
            continue = no;
        end
    end
    q* = q_c.
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Figure  33  shows  an  example  of  Delaunay  walk.  We  initialize  qc  to  q~  and  the  algorithm 
selects  the  path  q7  -t  q2  94  q-0  to  reach  q5,  which  is  the  closest  to  px. 


Figure  33:  A  Delaunay  Walk  Example 

Notice that each $q$ must maintain a list of its neighbors in $DT(Q)$. The set of all neighbor
lists is denoted by $NB(Q)$. Once $DT(Q)$ and $NB(Q)$ are computed, the worst-case search
time is proportional to a diameter of $DT(Q)$. Moreover, except for the first search, the
worst-case search can practically be avoided by using $DT(P)$ and $NB(P)$. For example, to
find the closest $q^* \in Q$ for $p_2$, $q_c$ can be initialized to $q_5$ since $p_2$ is a neighbor of $p_1$. In
other words, $q_c$ is a point very close to the actual destination $q^*$, which is $q_6$ in this case.
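A compact Python rendering of the walk is sketched below. To stay self-contained it does not compute a triangulation: the neighbor lists are written out by hand for five points (the corners of a square plus its center), for which connecting the center to every corner is a valid Delaunay triangulation. The function name and data layout are assumptions.

```python
import math

def delaunay_walk(p, points, neighbors, start):
    """Greedy walk on the Delaunay neighbor graph toward the query point p.

    points: list of (x, y); neighbors: {index: [indices of Delaunay neighbors]}.
    Returns the index of the point where no neighbor is closer to p.
    """
    current = start
    d_current = math.dist(p, points[current])
    while True:
        best = min(neighbors[current], key=lambda k: math.dist(p, points[k]))
        d_best = math.dist(p, points[best])
        if d_best < d_current:
            current, d_current = best, d_best
        else:
            return current

# Square corners (indices 0-3) and center (index 4).
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
neighbors = {0: [1, 3, 4], 1: [0, 2, 4], 2: [1, 3, 4], 3: [0, 2, 4], 4: [0, 1, 2, 3]}

first = delaunay_walk((1.9, 1.8), points, neighbors, start=0)       # walks 0 -> 4 -> 2
second = delaunay_walk((1.8, 0.1), points, neighbors, start=first)  # warm start from the previous answer
```

The second call illustrates the warm start described above: when consecutive queries are spatially close, starting from the previous answer keeps the walk short.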


9.2  Experiments 

The following experiments are done with Matlab and C on a Pentium II 300-MHz machine.
Two point sets $P$ and $Q$ are generated randomly with a uniform distribution; $P$ and $Q$ have
the same number of points, i.e., $|P| = |Q| = n$. We find the closest $q^* \in Q$ for each $p \in P$
with $n$ brute force searches and with $n$ Delaunay walks. The results from both methods are
cross-verified. Note that a random start is used only for the first Delaunay walk (Section 9.1).

Figure 34 shows the result of the first experiment with Matlab. The CPU time for
Delaunay walk includes the time for computing $DT(P)$, $DT(Q)$, $NB(P)$ and $NB(Q)$. The
efficiency of Delaunay walk is evident with larger point sets.

In some applications [3], the data structures $DT(P)$, $DT(Q)$, $NB(P)$ and $NB(Q)$ are
computed once and used many times. Therefore, it is appropriate to compare only the
search time of Delaunay walk and brute force search, as shown in Figure 35. This
experiment is done in C to optimize the performance. The plot shows that the
Delaunay walk outperforms brute force search when $n$ is eight or greater.
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Figure  34:  Delaunay  Walk  versus  Brute  Force  Search  (Matlab) 


Figure  35:  Delaunay  Walk  versus  Brute  Force  Search  (C;  Search  Time  Only) 


9.3  Theorem  and  Proof 

In this section we present a theorem as the basis for Delaunay walk. This theorem guarantees
that $q^*$ is found by a Delaunay walk with a random start. A proof by contradiction is also
provided below.

Theorem:

Given a finite point set $Q \subset \mathbb{R}^2$, let $DT(Q)$ denote the Delaunay triangulation of $Q$ and
$NB(q)$ the neighbors of $q \in Q$ within $DT(Q)$. Then for $p \in \mathbb{R}^2$, we have the following
implication:

$$\|q^* - p\| \le \|q - p\| \;\; \forall\, q \in NB(q^*) \;\Longrightarrow\; \|q^* - p\| \le \|q - p\| \;\; \forall\, q \in Q \qquad (19)$$

In words, if $q^*$ is as close to $p$ as any of $q^*$'s neighbors, then $q^*$ is a closest neighbor of $p$
over the entire set $Q$.

Proof: 

Let Cone($q^*$) denote the set of all triangles in $DT(Q)$ which have $q^*$ as one of the vertices:
Cone($q^*$) = $\{\triangle abc \in DT(Q) : q^* = a, \text{ or } q^* = b, \text{ or } q^* = c\}$.

There are two cases for Cone($q^*$):

Case 1. $q^* \in CH(Q)$;

Case 2. $q^* \notin CH(Q)$, where $CH(Q)$ denotes the convex hull of $Q$.

Now let's proceed with an assumption that $q^*$ is as close to $p$ as any of its neighbors and
$\exists\, q \notin NB(q^*)$ such that $\|q - p\| < \|q^* - p\|$.


Figure  36:  Geometry  for  the  Assumption  (Case  1) 

Without loss of generality, Figure 36 shows the geometry for the assumption under Case 1.
This is, however, not a valid case because the existence of $q$ contradicts the fact that
$q^* \in CH(Q)$. To be a valid case, the angle between the two neighboring edges, $q^* q_{k(p)}$ and
$q^* q_{k+1(p)}$, must be smaller than 180 degrees.

Figure 37(a) and (b) fall into the category of Case 2, and they both satisfy the constraint
$\angle q_{k(p)}\, q^*\, q_{k+1(p)} < 180°$. But Figure 37(a) is also not possible because $q$ is inside
$\triangle q^* q_{k(p)} q_{k+1(p)}$, and $q^* q$ must be present, which contradicts our assumption, $q \notin NB(q^*)$.

The four points, $q^*$, $q_{k(p)}$, $q$, and $q_{k+1(p)}$, form a convex quadrilateral in Figure 37(b). There
are two ways to triangulate these four points, as demonstrated in Figure 38(a) and (b). We
show below that the triangulation shown in Figure 38(a) does not make a legal Delaunay
triangulation, and the proof is completed since 38(b) contradicts the assumption,
$q \notin NB(q^*)$.
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(a)  (b) 

Figure 38: Two Possible Triangulations of a Convex Quadrilateral


Figure 39: Geometry of the Proof


Figure 39(a) shows the geometry of the problem. $C_1$ is the circle determined by $q^*$ and $p$,
and $C_2$ is the circle determined by $q^*$, $q_{k(p)}$, and $q_{k+1(p)}$. Note that points $q_{k(p)}$ and $q_{k+1(p)}$
are outside $C_1$ by the assumption. If $C_1$ and $C_2$ intersect only at $q^*$, then $C_1$ is completely
inside $C_2$. If $C_1$ and $C_2$ intersect at two points, $q^*$ and $r$, then the hatched region in Figure
39(b) is inside $C_2$.

The above statement may be trivial. But in order to be more rigorous, we can use Thales'
theorem [8] to prove this is indeed the case. Because $C_1$ and $C_2$ share the same chord $r q^*$,
and $q_{k+1(p)}$ is outside $C_1$, by Thales' theorem we have
$\angle r s q^* > \angle r q_{k+1(p)} q^* = \angle r t q^*$, $\forall (s \in C_1, t \in C_2)$ on the lower side (the side containing $p$) of
$r q^*$. Therefore, the hatched region is inside $C_2$. Since our $q$ is always in the hatched
region, which is completely inside $C_2$, $q_{k(p)} q_{k+1(p)}$ is an illegal edge because the circle $C_2$ is
not site-free [8]. Furthermore, because $q^*$, $q_{k(p)}$, $q$, and $q_{k+1(p)}$ form a convex quadrilateral,
exactly one of $q_{k(p)} q_{k+1(p)}$ and $q^* q$ is legal in general situations, i.e., when the four points
are not on a common circle [8]. In conclusion, Figure 38(a) is not possible and $q^* q$ must be
present.
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10.  Target  Matching 


We  use  synthesized  intensity  images  (synthesized  from  peaks)  and  a  probability  matching 
metric  for  both  image  alignment  and  target  recognition.  Our  preliminary  experiments 
achieved  good  recognition  rates  for  test  images  with  different  depression  angles. 
Recognition  of  targets  with  different  depression  angle  and  configuration  from  the  training 
set  is  also  tested  for  T72. 


10.1  Matching  Metrics 

Let  U,  Du  =  {{i,j)\Uij  >  0}  and  V,  Dv  =  {(i,  j ) | >  0}  be  the  synthesized  intensity 
image  of  the  template  and  test  image  respectively.  W’here  L Jij  =  Efc  > 

Vij  =  Efc  Avke  ^ and  a  ~  1  pixel. 

A natural deterministic matching metric is the Euclidean distance of the two image vectors
in multidimensions:

$$\sqrt{\sum_{(i,j) \in D} (V_{ij} - U_{ij})^2}, \qquad D = D_V \cup D_U \qquad (20)$$


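Each pixel intensity difference, $V_{ij} - U_{ij}$, is modeled simply as a zero mean Gaussian
random variable with a standard deviation depending on the mean pixel intensity; i.e.,
$X_{ij} = V_{ij} - U_{ij} \sim N(0, \sigma_{ij})$, where $\sigma_{ij} = \sigma_{ij}\!\left(\frac{V_{ij} + U_{ij}}{2}\right)$. We use a linear model for $\sigma_{ij}$,
$\sigma_{ij} = \eta \cdot \frac{V_{ij} + U_{ij}}{2} + \epsilon$. A normalized measure is $Y_{ij} = X_{ij} / \sigma_{ij} \sim N(0, 1)$. Therefore, a normalized
distance measure has the following form:

$$\sqrt{\sum_{(i,j) \in D} Y_{ij}^2}, \qquad D = D_U \cup D_V \qquad (21)$$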


Note that $\sqrt{\sum_{(i,j) \in D} Y_{ij}^2}$ or $\sum_{(i,j) \in D} Y_{ij}^2$ is a probability measure because
$\Pr\!\left(\text{distance} > \sqrt{\sum_{(i,j) \in D} Y_{ij}^2}\right)$ can be used as a matching metric: the larger the probability,
the more likely the match. Ideally, if the template image and the test image come from the
same target, then $D \approx D_V$ and, under an i.i.d. assumption, $\sum_{(i,j) \in D} Y_{ij}^2$ has a Chi-square
distribution with $|D_V|$ (size of $D_V$) degrees of freedom. In order to take into account the
effect of $|D_U|$, the following matching metric is used:

$$\frac{1}{\sqrt{2|D|}} \sum_{(i,j) \in D} (Z_{ij} - 1) \;\to\; z \sim N(0, 1), \qquad \text{where } Z_{ij} = Y_{ij}^2 \qquad (22)$$

The Central Limit Theorem is used in Equation (22); it is applicable since $|D_V| \approx 500$. Note
also that mean($Z_{ij}$) = 1 and var($Z_{ij}$) = 2.
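A direct transcription of Expression (22) might look like the following sketch. It assumes the two synthesized images are given as equally sized nested lists that are already aligned, and the values of the linear noise model parameters (`eta`, `eps`) are placeholders chosen for illustration, not values from the report.

```python
import math

def matching_disparity(U, V, eta=0.5, eps=0.1):
    """Normalized matching disparity between template U and test image V.

    The sum runs over D, the pixels where either image is positive.
    eta and eps are placeholder parameters of the linear sigma model.
    """
    total, count = 0.0, 0
    for row_u, row_v in zip(U, V):
        for u, v in zip(row_u, row_v):
            if u > 0 or v > 0:
                sigma = eta * (u + v) / 2.0 + eps
                y = (v - u) / sigma
                total += y * y - 1.0   # Z_ij - 1, with Z_ij = Y_ij^2
                count += 1
    if count == 0:
        raise ValueError("both images are empty")
    return total / math.sqrt(2.0 * count)

U = [[0.0, 2.0], [1.0, 0.0]]
same = matching_disparity(U, U)
different = matching_disparity(U, [[0.0, 4.0], [1.0, 0.0]])
```

Lower values indicate a better match: identical images give the minimum possible value for their support size, and any intensity difference raises the disparity.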


10.2  Experiments  and  Results 


The  target  database  contains  a  total  of  300  template  images;  100  for  each  of  the  3  targets: 
BTR70,  BMP2,  and  T72.  The  template  images  have  a  17-degree  depression  angle.  They 
are  chosen  to  span  evenly  the  0  to  360  degrees  target  orientation  range.  For  each  test 
image,  we  perform  peak  detection,  segmentation,  and  leading  surface  estimation,  in  that 
order. Only the test images with correctly estimated orientation (i.e., error < 10 degrees) are
used  in  the  recognition  experiments. 

The estimated target orientation, $\alpha$, of the test image is used to select $N$ images (from each
target class) that have the closest orientation to $\alpha$ as hypotheses. $N$ images (from each
target class) that have the closest orientation to $\alpha \pm 180°$ are also selected as 180-degree
alternative hypotheses. Alternative hypotheses are needed because leading surfaces only
determine target orientation up to a 180-degree flip. Expression (22) is a measure of
matching disparity. The hypothesis that has the lowest matching disparity determines the
class name and orientation of the test target. Note that $N = 3$ is used in our experiments.
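Selecting the hypotheses for one target class can be sketched as below. The sketch assumes template orientations are stored in degrees and uses a wrap-around angular difference so that, for example, 356.4 degrees counts as close to 1 degree; the function names are hypothetical.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)

def select_hypotheses(template_azimuths, alpha_deg, n=3):
    """Return indices of the n closest templates to alpha and to alpha + 180."""
    def closest(target):
        order = sorted(range(len(template_azimuths)),
                       key=lambda k: angular_difference(template_azimuths[k], target))
        return order[:n]
    return closest(alpha_deg), closest(alpha_deg + 180.0)

# 100 templates per class, evenly spaced over 360 degrees (3.6-degree steps).
azimuths = [3.6 * k for k in range(100)]
primary, alternative = select_hypotheses(azimuths, alpha_deg=1.0)
```

With three classes and N = 3, this yields 18 hypotheses per test image (9 primary and 9 alternative), each of which is then scored with the matching disparity.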

Expression (22) is not only used as a target matching metric, but also as an image
alignment metric. In other words, expression (22) is optimized for a given image pair
during the process of image alignment. The metric is a function of the transformation $T$;
therefore, the aligning method in Section 8.2 is applicable here. However, aligning with
expression (22) is computationally more expensive compared to aligning with the ideogram.

The confusion matrix for test images with depression angle = 17 degrees is shown in Table
9. For the BTR70, there is a total of 118 test images; 116 of them are correctly recognized
as BTR70, and 2 of them are recognized as 180-degree alternative BTR70 (i.e., wrong
pose). Figure 40 shows the matching disparities of the hypotheses as a function of target
orientation. The disparity of the BTR70 hypothesis is used as a reference. The overall
recognition rate for the three targets is 98.9 percent. Table 10 shows the confusion matrix
for test images with depression angle = 15 degrees. The overall recognition rate for the
three targets is 95.8 percent.

Table 9: Test Depression Angle = 17°

                   BTR70    BMP2     T72
BTR70 (98.3%)      116/2    0/0      0/0
BMP2 (98.4%)       1/1      121/0    0/0
T72 (100%)         0/0      0/0      116/0

Table 10: Test Depression Angle = 15°

                   BTR70    BMP2     T72
BTR70 (95.2%)      178/9    0/0      0/0
BMP2 (96.2%)       1/0      178/3    3/0
T72 (96.1%)        0/0      5/0      171/2
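The recognition rates quoted with Tables 9 and 10 can be reproduced from the table entries, as in the sketch below. It assumes each entry "a/b" means a images recognized with the hypothesized pose and b with the 180-degree alternative pose, and that only the first number on the diagonal counts as correct, which is consistent with the BTR70 example in the text.

```python
def recognition_rates(confusion):
    """Return ({class: rate in percent}, overall rate in percent).

    confusion: {true_class: {recognized_class: (pose_ok, pose_flipped)}}.
    """
    per_class, correct_total, grand_total = {}, 0, 0
    for true_class, row in confusion.items():
        total = sum(a + b for a, b in row.values())
        correct = row[true_class][0]
        per_class[true_class] = 100.0 * correct / total
        correct_total += correct
        grand_total += total
    return per_class, 100.0 * correct_total / grand_total

table9 = {
    "BTR70": {"BTR70": (116, 2), "BMP2": (0, 0), "T72": (0, 0)},
    "BMP2": {"BTR70": (1, 1), "BMP2": (121, 0), "T72": (0, 0)},
    "T72": {"BTR70": (0, 0), "BMP2": (0, 0), "T72": (116, 0)},
}
table10 = {
    "BTR70": {"BTR70": (178, 9), "BMP2": (0, 0), "T72": (0, 0)},
    "BMP2": {"BTR70": (1, 0), "BMP2": (178, 3), "T72": (3, 0)},
    "T72": {"BTR70": (0, 0), "BMP2": (5, 0), "T72": (171, 2)},
}
rates17, overall17 = recognition_rates(table9)
rates15, overall15 = recognition_rates(table10)
```

Under this reading, a correct class with the wrong pose counts as an error, which is why the two wrong-pose BTR70 images lower the 17-degree BTR70 rate to 98.3 percent.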
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List  of  Acronyms 

ACRONYM  DESCRIPTION 

ATR  Automatic  target  recognition 

edgels  Edge  elements 

EOCs  Extended  operating  conditions 

ISAR  Inverse  synthetic  aperture  radar 

LLS  Longer  leading  surface 

SLS  Shorter  leading  surface 

MSTAR  Moving  and  stationary  target  acquisition  and  recognition 

ROI  Region  of  interest

SAR  Synthetic  aperture  radar 
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